
 interaction function




Data-driven Learning of Interaction Laws in Multispecies Particle Systems with Gaussian Processes: Convergence Theory and Applications

arXiv.org Machine Learning

We develop a Gaussian process framework for learning interaction kernels in multi-species interacting particle systems from trajectory data. Such systems provide a canonical setting for multiscale modeling, where simple microscopic interaction rules generate complex macroscopic behaviors. While our earlier work established a Gaussian process approach and convergence theory for single-species systems, and later extended to second-order models with alignment and energy-type interactions, the multi-species setting introduces new challenges: heterogeneous populations interact both within and across species, the number of unknown kernels grows, and asymmetric interactions such as predator-prey dynamics must be accommodated. We formulate the learning problem in a nonparametric Bayesian setting and establish rigorous statistical guarantees. Our analysis shows recoverability of the interaction kernels, provides quantitative error bounds, and proves statistical optimality of posterior estimators, thereby unifying and generalizing previous single-species theory. Numerical experiments confirm the theoretical predictions and demonstrate the effectiveness of the proposed approach, highlighting its advantages over existing kernel-based methods. This work contributes a complete statistical framework for data-driven inference of interaction laws in multi-species systems, advancing the broader multiscale modeling program of connecting microscopic particle dynamics with emergent macroscopic behavior.
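The abstract does not state the governing equations. As a point of orientation only, a first-order multi-species model that is common in this literature can be written as below. This is an assumed form, not necessarily the paper's exact formulation. The notation is hypothetical: $N$ agents in $K$ species, $k_i$ the species of agent $i$, $N_k$ the number of agents in species $k$, and $\phi_{kk'}$ the kernel governing how species $k'$ acts on species $k$.

```latex
\dot{x}_i(t) \;=\; \sum_{i'=1}^{N} \frac{1}{N_{k_{i'}}}\,
  \phi_{k_i k_{i'}}\!\bigl(\lVert x_{i'}(t) - x_i(t) \rVert\bigr)\,
  \bigl(x_{i'}(t) - x_i(t)\bigr),
\qquad i = 1, \dots, N .
```

In this form there are $K^2$ unknown kernels, and asymmetric interactions such as predator-prey dynamics correspond to $\phi_{kk'} \neq \phi_{k'k}$.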


Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models

arXiv.org Machine Learning

We study the convergence rate of learning pairwise interactions in single-layer attention-style models, where tokens interact through a weight matrix and a non-linear activation function. We prove that the minimax rate is $M^{-\frac{2\beta}{2\beta+1}}$, with $M$ being the sample size. The rate depends only on the smoothness $\beta$ of the activation and is, crucially, independent of the token count, the ambient dimension, and the rank of the weight matrix. These results highlight a fundamental dimension-free statistical efficiency of attention-style nonlocal models, even when the weight matrix and activation are not separately identifiable, and provide a theoretical understanding of the attention mechanism and its training.
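To get a feel for the stated rate, the short sketch below evaluates $M^{-2\beta/(2\beta+1)}$ for a few sample sizes and smoothness levels. It is plain arithmetic on the formula in the abstract. The function name `minimax_rate` is a hypothetical helper, and constants in front of the rate are ignored.

```python
def minimax_rate(M: int, beta: float) -> float:
    """Evaluate M ** (-2*beta / (2*beta + 1)), ignoring constant factors.

    M    : sample size (must be positive)
    beta : smoothness of the activation (must be positive)
    """
    if M <= 0 or beta <= 0:
        raise ValueError("M and beta must be positive")
    exponent = -2.0 * beta / (2.0 * beta + 1.0)
    return M ** exponent


if __name__ == "__main__":
    # Smoother activations (larger beta) push the exponent toward -1,
    # the parametric rate; no dimension or token count appears anywhere.
    for beta in (0.5, 1.0, 2.0):
        rates = [minimax_rate(M, beta) for M in (10**2, 10**4, 10**6)]
        print(f"beta={beta}: {rates}")
```

Note that the formula takes only `M` and `beta` as inputs, which is the dimension-free property the abstract emphasizes.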




Learning Swarm Interaction Dynamics from Density Evolution

arXiv.org Artificial Intelligence

We consider the problem of understanding the coordinated movements of biological or artificial swarms. In this regard, we propose a learning scheme to estimate the coordination laws of the interacting agents from observations of the swarm's density over time. We describe the dynamics of the swarm based on pairwise interactions according to a Cucker-Smale flocking model, and express the swarm's density evolution as the solution to a system of mean-field hydrodynamic equations. We propose a new family of parametric functions to model the pairwise interactions, which allows for the mean-field macroscopic system of integro-differential equations to be efficiently solved as an augmented system of PDEs. Finally, we incorporate the augmented system in an iterative optimization scheme to learn the dynamics of the interacting agents from observations of the swarm's density evolution over time. The results of this work can offer an alternative approach to study how animal flocks coordinate, create new control schemes for large networked systems, and serve as a central part of defense mechanisms against adversarial drone attacks.
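For readers unfamiliar with the Cucker-Smale model, here is a minimal particle-level sketch in which each agent relaxes its velocity toward those of its neighbours, weighted by distance. It is only an illustration of the microscopic model. It uses the classical communication weight $\psi(r) = K/(1+r^2)^{\gamma}$ as an assumption, not the new parametric family proposed in the paper, and it does not implement the mean-field density equations or the learning scheme. All function names and parameter values are hypothetical.

```python
import math


def communication_weight(r, K=1.0, gamma=0.5):
    """Classical Cucker-Smale weight psi(r) = K / (1 + r^2)^gamma (assumed form)."""
    return K / (1.0 + r * r) ** gamma


def cucker_smale_step(positions, velocities, dt, K=1.0, gamma=0.5):
    """One explicit Euler step of the 2D Cucker-Smale system.

    dx_i/dt = v_i
    dv_i/dt = (1/N) * sum_j psi(|x_j - x_i|) * (v_j - v_i)
    """
    n = len(positions)
    new_velocities = []
    for i in range(n):
        ax = ay = 0.0
        for j in range(n):
            if i == j:
                continue
            w = communication_weight(math.dist(positions[i], positions[j]), K, gamma)
            ax += w * (velocities[j][0] - velocities[i][0])
            ay += w * (velocities[j][1] - velocities[i][1])
        vx, vy = velocities[i]
        new_velocities.append((vx + dt * ax / n, vy + dt * ay / n))
    # Positions advance with the velocities from the start of the step.
    new_positions = [
        (x + dt * vx, y + dt * vy)
        for (x, y), (vx, vy) in zip(positions, velocities)
    ]
    return new_positions, new_velocities


def mean_velocity(velocities):
    n = len(velocities)
    return (sum(v[0] for v in velocities) / n, sum(v[1] for v in velocities) / n)


def velocity_spread(velocities):
    """Mean squared deviation of the velocities from their average."""
    mx, my = mean_velocity(velocities)
    return sum((vx - mx) ** 2 + (vy - my) ** 2 for vx, vy in velocities) / len(velocities)


if __name__ == "__main__":
    pos = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    vel = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.5), (0.5, -0.5)]
    print("initial spread:", velocity_spread(vel))
    for _ in range(200):
        pos, vel = cucker_smale_step(pos, vel, dt=0.05)
    print("final spread:  ", velocity_spread(vel))
```

Because the weight is symmetric in $i$ and $j$, the mean velocity is conserved while the spread of velocities shrinks, which is the flocking behaviour the model is known for.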


Geometrically-Aware One-Shot Skill Transfer of Category-Level Objects

arXiv.org Artificial Intelligence

Robotic manipulation of unfamiliar objects in new environments is challenging and requires extensive training or laborious pre-programming. We propose a new skill transfer framework, which enables a robot to transfer complex object manipulation skills and constraints from a single human demonstration. Our approach addresses the challenge of skill acquisition and task execution by deriving geometric representations from demonstrations focusing on object-centric interactions. By leveraging the Functional Maps (FM) framework, we efficiently map interaction functions between objects and their environments, allowing the robot to replicate task operations across objects of similar topologies or categories, even when they have significantly different shapes. Additionally, our method incorporates a Task-Space Imitation Algorithm (TSIA) which generates smooth, geometrically-aware robot paths to ensure the transferred skills adhere to the demonstrated task constraints. We validate the effectiveness and adaptability of our approach through extensive experiments, demonstrating successful skill transfer and task execution in diverse real-world environments without requiring additional training.


Preventing the Popular Item Embedding Based Attack in Federated Recommendations

arXiv.org Artificial Intelligence

Privacy concerns have led to the rise of federated recommender systems (FRS), which can create personalized models across distributed clients. However, FRS is vulnerable to poisoning attacks, where malicious users manipulate gradients to promote their target items intentionally. Existing attacks against FRS have limitations, as they depend on specific models and prior knowledge, restricting their real-world applicability. In our exploration of practical FRS vulnerabilities, we devise a model-agnostic and prior-knowledge-free attack, named PIECK (Popular Item Embedding based Attack). The core module of PIECK is popular item mining, which leverages embedding changes during FRS training to effectively identify the popular items. Built upon the core module, PIECK branches into two diverse solutions. The PIECK$_{IPE}$ solution employs an item popularity enhancement module, which aligns the embeddings of targeted items with the mined popular items to increase item exposure. The PIECK$_{UEA}$ solution further enhances the robustness of the attack by using a user embedding approximation module, which approximates private user embeddings using mined popular items. Upon identifying PIECK, we evaluate existing federated defense methods and find them ineffective against PIECK, as poisonous gradients inevitably overwhelm the cold target items. We then propose a novel defense method by introducing two regularization terms during user training, which constrain item popularity enhancement and user embedding approximation while preserving FRS performance. We evaluate PIECK and its defense across two base models, three real datasets, four top-tier attacks, and six general defense methods, affirming the efficacy of both PIECK and its defense.


Nonparametric estimation of Hawkes processes with RKHSs

arXiv.org Machine Learning

Hawkes processes are a class of past-dependent point processes, widely used in applications such as seismology [Ogata, 1988], criminology [Olinde and Short, 2020] and neuroscience [Reynaud-Bouret et al., 2013] for their ability to capture complex dependence structures. In their multidimensional version [Ogata, 1988], Hawkes processes can model pairwise interactions between different types of events, making it possible to recover a connectivity graph between different features. Hawkes processes were originally developed by Hawkes [1971] to model self-exciting phenomena, where each event increases the probability of a new event occurring, and many extensions have been proposed since. In particular, nonlinear Hawkes processes have been introduced, notably to detect inhibiting interactions, where an event can decrease the probability of another one appearing. Hawkes processes with inhibition are notoriously more complicated to handle because they lose many properties of linear Hawkes processes, such as the cluster representation and the branching structure of the process [Hawkes and Oakes, 1974]. The first article on nonlinear Hawkes processes [Brémaud and Massoulié, 1996] proved in particular their existence, and many works have focused on inhibition in the past few years. Among them, limit theorems have been established in [Costa et al., 2020], while Duval et al. [2022] obtained mean-field results on the behaviour of two neuronal populations. Regarding statistical inference, in the frequentist setting we can mention the exact maximum likelihood procedure of Bonnet et al. [2023], the least-squares approach by Bacry et al. [2020] and the nonparametric approach based on Bernstein-type polynomials by Lemonnier and Vayatis [2014]. The first of these is an exact inference procedure, but it is restricted to exponential kernels.
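To make the distinction between excitation and inhibition concrete, the sketch below evaluates the conditional intensity of a one-dimensional Hawkes process with an exponential kernel. It assumes the common form $\lambda(t) = \max\bigl(0,\ \mu + \sum_{t_i < t} \alpha e^{-\beta (t - t_i)}\bigr)$, where a negative $\alpha$ models inhibition and the positive-part link keeps the intensity non-negative. This is an illustrative assumption, not the RKHS estimator of the paper. The function name and parameter values are hypothetical.

```python
import math


def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given past event times.

    mu    : baseline rate
    alpha : jump size per past event (negative values model inhibition)
    beta  : exponential decay rate of the kernel (beta > 0)

    Only events strictly before t contribute. The positive part is the
    nonlinear link that prevents a negative intensity under inhibition.
    """
    pre_activation = mu + sum(
        alpha * math.exp(-beta * (t - s)) for s in events if s < t
    )
    return max(0.0, pre_activation)


if __name__ == "__main__":
    past = [0.2, 0.9]
    # Self-exciting case: past events raise the intensity above the baseline.
    print(hawkes_intensity(1.0, past, mu=0.5, alpha=0.8, beta=1.0))
    # Inhibiting case: past events push the intensity down, clipped at zero.
    print(hawkes_intensity(1.0, past, mu=0.5, alpha=-2.0, beta=1.0))
```

With `alpha >= 0` the positive part is never active and the process is the linear, self-exciting case. With `alpha < 0` the clipping is what breaks the cluster and branching structure mentioned above.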